Foundations

With a dataset of 5 million unique incidents of traffic slowing or stopping in the city of Louisville, we want to determine whether the structure of the street network corresponds to trouble during the typical morning commute. Mapping every kink in vehicle flows during the month of June, we can see that hot spots concentrate around major roads, unsurprisingly. Yet this does not tell us about the structure underpinning the phenomenon. Before producing a predictive model, and in the interest of extracting valuable features from the urban fabric, we want to measure the relationships between streets and test how these measures associate with traffic congestion.

Jams

Getting from point A to point B in a city does not require hundreds of turns; drivers can simplify things by hopping on an expressway and then finishing the journey on back roads. The issue is one of topology. Topology, the study of spatial relations, is the study of how things interact, disregarding much of what happens in between interactions. Time on the expressway is less important than the number of turns. Topology allows us to understand the fragility of urban networks (what would happen if this road were closed?), the benefits of street arrangements (is a grid or a mess of streets better?), and countless other questions about cities and their function. Topology is key to understanding the twin issues of efficiency and robustness, and thus how to trade off the costs of redundancy against the risks of failure.

The following tests whether graph theory can help us better understand urban networks. This study begins by transforming data grounded in geography, or positional data, into relational data, with information regarding not just where a thing is but also how it relates to other things (in this case, a street and other streets). This is done in Python, with the osmnx package. After recasting streets as nodes in a network, and their intersections as links between them, it then takes on the challenge of understanding how important each street is to the functioning of the network as a whole. A naive approach may extract nodes at intersections, using ArcGIS or QGIS, but this would be akin to taking the connections in a social network as nodes and the friends themselves as links. Streets interact at their intersections, but, at least to this urbanist, streets are the focus: the locus of activity and the scaffold for urban life. We move through and to streets, not intersections. We know of Madison Avenue, Lombard Street, and Rodeo Drive. (An edge is a line on a graph and a node is a point. The appended code attempts to interpret the urban network in this way, with streets not as edges but as nodes. Intersections are their connections, the edges.)
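The recasting of streets as nodes and intersections as links can be sketched on a toy example. This is a minimal illustration in plain Python, not the osmnx workflow itself: the `intersections` table and the function name `streets_as_nodes` are hypothetical, and in practice the street-to-intersection lookup would have to be derived from the downloaded network.

```python
from itertools import combinations

# Hypothetical toy input: each intersection lists the streets that meet there.
# In a real workflow this table would be derived from the street network.
intersections = {
    "i1": ["Main St", "1st Ave"],
    "i2": ["Main St", "2nd Ave"],
    "i3": ["Broadway", "1st Ave"],
    "i4": ["Broadway", "2nd Ave", "River Rd"],
}


def streets_as_nodes(intersections):
    """Build a graph whose nodes are streets and whose edges are shared intersections."""
    graph = {}
    for streets in intersections.values():
        for street in streets:
            graph.setdefault(street, set())
        # Every pair of streets meeting at this intersection becomes an edge.
        for a, b in combinations(streets, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


dual = streets_as_nodes(intersections)
for street in sorted(dual):
    print(street, "->", sorted(dual[street]))
```

Each key of `dual` is a street and each value is the set of streets it touches, which is the relational form the centrality measures operate on.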

How critical or central is a stretch of pavement to the functioning of the city? Centrality can be measured as betweenness, closeness, or degree. If we list every pair of vertices in a network along with the shortest path between them, counting the number of times any given node is used along those paths gives its betweenness. Considering simply the distance from one vertex to all others gives its closeness. For a vertex, counting the number of edges attached gives its degree; for a street, this is just how many other streets touch it. By way of example, the following shows the various measures for the city of Louisville. We can see that high betweenness values cut across the city while high closeness values favor streets nearest the core. In spatially constrained networks, then, the center in terms of closeness is also likely to be the geographic center, close to the action. The center in terms of betweenness is likely to reflect geographic constraints: in a city bisected by a river, the only bridge lies between all the nodes on one side and all the nodes on the other. Data sources may be found in the citations.

Measures
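To make the three measures concrete, here is a minimal sketch in plain Python on a hypothetical toy graph: two triangles of streets joined by a single "Bridge" node, echoing the river example. The graph, the function names, and the closeness normalization ((n - 1) divided by the sum of distances, one common convention) are assumptions for illustration; betweenness is computed with Brandes' algorithm on an unweighted graph, which is not necessarily how the Louisville figures were produced.

```python
from collections import deque

# Hypothetical toy graph: two triangles joined by one bridge.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "Bridge"),
         ("Bridge", "D"), ("D", "E"), ("D", "F"), ("E", "F")]
graph = {}
for u, v in edges:
    graph.setdefault(u, []).append(v)
    graph.setdefault(v, []).append(u)


def degree(graph):
    """Number of edges attached to each node."""
    return {v: len(nbrs) for v, nbrs in graph.items()}


def closeness(graph):
    """(n - 1) / sum of hop distances to all other nodes; assumes a connected graph."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness(graph):
    """Unnormalized betweenness via Brandes' algorithm (unweighted, undirected)."""
    score = dict.fromkeys(graph, 0.0)
    for s in graph:
        order, preds = [], {v: [] for v in graph}
        sigma = dict.fromkeys(graph, 0)  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):  # accumulate dependencies, farthest first
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: b / 2 for v, b in score.items()}


deg, clo, btw = degree(graph), closeness(graph), betweenness(graph)
for v in sorted(graph):
    print(f"{v:7s} degree={deg[v]} closeness={clo[v]:.3f} betweenness={btw[v]:.1f}")
```

In this toy graph the bridge has a low degree but the highest betweenness, since every path between the two sides must cross it.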

We then estimate usage with commuting data from the Census, which provides origins and destinations and is accessible through the lehdr package in R. Using the network we derived earlier, we can then impute shortest paths, incorporating factors like road speeds, between these points. The Census estimates all commutes between tracts, so as one additional layer of imputation, we sample random points within those tracts, based on the number of jobs travelling from point A to point B, and connect them via a shortest path. This means that each multiline is a single commute rather than a bundle of commutes. It also adds noise to the imputation, as points from different areas of a tract, rather than its centroid, will follow different paths (a noise that we believe usefully hedges against a flawed imputation by guessing at an array of them). All of this work can be accomplished with the sf package in R, though osmnx also provides a quick and easy process.

Commutes
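The imputation step described above can be sketched in plain Python. Everything here is a hypothetical stand-in: the toy `roads` table, the `tracts` lookup of candidate points per tract, and the `flows` table of job counts are invented for illustration, and Dijkstra's algorithm over travel time is one reasonable way to incorporate road speeds, not necessarily the routing used in the study.

```python
import heapq
import random

# Hypothetical toy network: (from, to, length_km, speed_kmh).
roads = [
    ("a", "b", 1.0, 30),
    ("b", "c", 1.0, 30),
    ("a", "d", 2.0, 100),  # longer but faster, like an expressway
    ("d", "c", 2.0, 100),
    ("c", "e", 1.0, 50),
]


def build_graph(roads):
    """Undirected adjacency list weighted by travel time in minutes."""
    graph = {}
    for u, v, km, kmh in roads:
        minutes = km / kmh * 60
        graph.setdefault(u, []).append((v, minutes))
        graph.setdefault(v, []).append((u, minutes))
    return graph


def fastest_path(graph, origin, destination):
    """Dijkstra's algorithm; returns the node sequence or None if unreachable."""
    best = {origin: 0.0}
    previous = {}
    heap = [(0.0, origin)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == destination:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, minutes in graph[node]:
            new_cost = cost + minutes
            if new_cost < best.get(nbr, float("inf")):
                best[nbr] = new_cost
                previous[nbr] = node
                heapq.heappush(heap, (new_cost, nbr))
    if destination not in best:
        return None
    path = [destination]
    while path[-1] != origin:
        path.append(previous[path[-1]])
    return path[::-1]


def sample_commutes(graph, tracts, flows, seed=0):
    """One routed path per job, with random endpoints drawn inside each tract."""
    rng = random.Random(seed)
    commutes = []
    for home, work, jobs in flows:
        for _ in range(jobs):
            origin = rng.choice(tracts[home])
            destination = rng.choice(tracts[work])
            commutes.append(fastest_path(graph, origin, destination))
    return commutes


graph = build_graph(roads)
tracts = {"T1": ["a", "b"], "T2": ["c", "e"]}  # hypothetical points per tract
flows = [("T1", "T2", 3)]                      # hypothetical: 3 jobs from T1 to T2
commutes = sample_commutes(graph, tracts, flows)
print(fastest_path(graph, "a", "c"))
print(commutes)
```

Because edges are weighted by minutes rather than kilometers, the route from `a` to `c` takes the longer but faster pair of roads through `d`, much as a driver would choose the expressway.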

From there, we can bundle all usage by street segment to create weights, using the stplanr package in R. These weights constitute an input into the regression, representing where we believe the network should be busy.

Estimates
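The bundling of usage by segment amounts to counting how many imputed commutes traverse each segment. The sketch below shows that idea in plain Python on hypothetical paths; it is not the stplanr implementation, and the `commutes` list and `segment_weights` name are invented for illustration.

```python
from collections import Counter

# Hypothetical commutes, each an ordered list of nodes along a routed path.
commutes = [
    ["a", "d", "c"],
    ["a", "d", "c", "e"],
    ["b", "c", "e"],
]


def segment_weights(commutes):
    """Count how many commutes use each undirected segment."""
    weights = Counter()
    for path in commutes:
        for u, v in zip(path, path[1:]):
            # Sort the endpoints so a-d and d-a count as the same segment.
            weights[tuple(sorted((u, v)))] += 1
    return weights


weights = segment_weights(commutes)
for segment, count in sorted(weights.items()):
    print(segment, count)
```

The resulting counts per segment play the role of the usage weights that feed the regression.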

Process

To apply rigorous statistical logic to the process, we convert a morphological representation of streets to a topological one, using data from the Census obtained through the tigris package. The city provides an interesting mix of orders and arrangements: a grid constitutes the urban core, but pure desire lines, coming in from suburban enclaves, cut into it.